Design and Implementation of an Improved K-Means Clustering Algorithm
Abstract
To address the problems of the traditional K-means clustering algorithm, such as convergence to a local optimum and slow clustering caused by the uncertain value of k and the random selection of initial cluster centers, this paper proposes an improved K-means method. The algorithm first applies the elbow rule, based on the sum of squared errors (SSE), to obtain an appropriate number of clusters k. It then uses variance as a measure of the dispersion of the samples and selects, as the initial cluster center set, data points with the smallest variance whose distance is greater than the average distance between samples. Finally, the “triangular inequality principle” is used to reduce unnecessary distance calculations in the iterative process, which improves operating efficiency. Tested on UCI data sets, the results show that, compared with the traditional k-means algorithm and the Canopy-KMeans algorithm, the accuracy and speedup ratio of the improved algorithm are significantly improved, as is clustering quality.
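The abstract does not give the algorithm's details, so the following is only a minimal sketch of two ingredients it names, written under stated assumptions rather than taken from the paper. It assumes Euclidean distance and uses the standard triangle-inequality bound: if d(c_best, c_j) >= 2 * d(x, c_best), then c_j cannot be strictly closer to x than c_best, so d(x, c_j) need not be computed. The function names `assign_with_pruning` and `sse` are hypothetical. The `sse` helper computes the sum of squared errors that an elbow rule would compare across candidate values of k.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points (sequences of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sq_dist(a, b))


def assign_with_pruning(points, centers):
    """Assign each point to its nearest center (assumed Euclidean metric).

    Center-to-center distances are computed once. For a point x whose
    current best center is c_best, any center c_j with
    d(c_best, c_j) >= 2 * d(x, c_best) is skipped, because the triangle
    inequality then guarantees d(x, c_j) >= d(x, c_best).

    Returns (labels, skipped), where skipped counts the point-to-center
    distance computations that were avoided.
    """
    k = len(centers)
    center_dist = [[dist(centers[i], centers[j]) for j in range(k)]
                   for i in range(k)]
    labels = []
    skipped = 0
    for x in points:
        best, best_d = 0, dist(x, centers[0])
        for j in range(1, k):
            if center_dist[best][j] >= 2 * best_d:
                skipped += 1  # pruned: c_j cannot beat the current best
                continue
            d = dist(x, centers[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped


def sse(points, centers, labels):
    """Sum of squared errors: squared distance of each point to its center."""
    return sum(sq_dist(x, centers[label]) for x, label in zip(points, labels))
```

In a full K-means loop this assignment step would alternate with recomputing each center as the mean of its assigned points. Running that loop for several values of k and plotting `sse` against k gives the curve in which the elbow rule looks for a bend. The variance-based choice of initial centers is not sketched here because the abstract does not specify it precisely enough.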
Similar resources
An Improved K-Means with Artificial Bee Colony Algorithm for Clustering Crimes
Crime detection is one of the major issues in the field of criminology. In fact, criminology includes knowing the details of a crime and its intangible relations with the offender. In spite of the enormous amount of data on offenses and offenders, and the complex and intangible semantic relationships between this information, criminology has become one of the most important areas in the field o...
An Efficient k-Means Clustering Algorithm: Analysis and Implementation
In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. A popular heuristic for k-means clustering is Lloyd's algorithm. In this paper, we present a simple and efficient implementation of Llo...
Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm
Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis. This paper proposed an improved version of K-Means algorithm, namely Persistent K...
An Improved K-means Algorithm for Clustering Categorical Data
Most of the earlier work on clustering is mainly focused on numerical data the inherent geometric properties of which can be exploited to naturally define distance functions between the data points. However, the computational cost makes most of the previous algorithms unacceptable for clustering very large databases. The k-means algorithm is well known for its efficiency in this respect. At the...
An Improved K-means Clustering Algorithm for Image Segmentation
Image segmentation is a primary step in many computer vision applications, whose purpose is to extract information from the images to allow the discrimination among different objects of interest. This task usually involves the partitioning of the image into a number of clusters, such that the data in each cluster share similar features. This work describes a new clustering algorithm for providi...
Journal
Journal title: Mobile Information Systems
Year: 2022
ISSN: 1875-905X, 1574-017X
DOI: https://doi.org/10.1155/2022/6041484